Bayesian group factor analysis with structured sparsity

Authors

  • Shiwen Zhao
  • Chuan Gao
  • Sayan Mukherjee
  • Barbara E. Engelhardt
Abstract

Latent factor models are the canonical statistical tool for exploratory analyses of low-dimensional linear structure for an observation matrix with p features across n samples. We develop a structured Bayesian group factor analysis model that extends the factor model to multiple coupled observation matrices; in the case of two observations, this reduces to a Bayesian model of canonical correlation analysis. The main contribution of this work is to carefully define a structured Bayesian prior that encourages both element-wise and column-wise shrinkage and leads to desirable behavior on high-dimensional data. In particular, our model puts a structured prior on the joint factor loading matrix, regularizing at three levels, which enables element-wise sparsity and unsupervised recovery of latent factors corresponding to structured variance across arbitrary subsets of the observations. In addition, our structured prior allows for both dense and sparse latent factors, so that covariation among either all features or only a subset of features can be recovered. We use fast parameter-expanded expectation-maximization for parameter estimation in this model. We validate our method on both simulated data with substantial structure and real data, comparing against a number of state-of-the-art approaches. These results illustrate useful properties of our model, including i) recovering sparse signal in the presence of dense effects; ii) the ability to scale naturally to large numbers of observations; iii) flexible observation- and factor-specific regularization to recover factors with a wide variety of sparsity levels and percentage of variance explained; and iv) tractable inference that scales to modern genomic and document data sizes.

arXiv:1411.2698v2 [stat.ME] 11 Nov 2015
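To make the data structure described in the abstract concrete, the following minimal sketch simulates several coupled observation matrices that share one set of latent factors through a joint loading matrix. It is an illustration only, not the authors' code: it does not implement the structured prior or the parameter-expanded EM inference. The view sizes, the activity pattern `active`, the `dense` flags, the 20% nonzero rate, and the noise scale are all assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 200                 # number of samples
p_views = [30, 20, 25]  # features per observation matrix (hypothetical sizes)
k = 4                   # number of latent factors

# Hypothetical pattern: active[h, w] says whether factor h contributes to
# observation matrix w. Inactive blocks of the loading matrix are exactly zero.
active = np.array([
    [1, 1, 1],  # factor 0: shared by all observation matrices
    [1, 1, 0],  # factor 1: shared by matrices 0 and 1 only
    [0, 0, 1],  # factor 2: specific to matrix 2
    [1, 0, 1],  # factor 3: shared by matrices 0 and 2
], dtype=bool)

# Hypothetical choice of which factors are dense and which are sparse.
dense = [True, False, True, False]


def simulate_loadings():
    """Build the joint loading matrix by stacking one block per observation matrix."""
    blocks = []
    for w, p_w in enumerate(p_views):
        block = np.zeros((p_w, k))
        for h in range(k):
            if not active[h, w]:
                continue  # whole block column stays zero
            column = rng.normal(size=p_w)
            if not dense[h]:
                # Sparse factor: keep roughly 20% of the entries (assumed rate).
                column = column * (rng.random(p_w) < 0.2)
            block[:, h] = column
        blocks.append(block)
    return np.vstack(blocks)


loadings = simulate_loadings()                  # shape (sum(p_views), k)
factors = rng.normal(size=(k, n))               # latent factors shared by all matrices
noise = rng.normal(scale=0.5, size=(sum(p_views), n))

# Joint factor model on the stacked observations: Y = loadings @ factors + noise.
Y = loadings @ factors + noise

# Split the stacked matrix back into the individual observation matrices.
views = np.split(Y, np.cumsum(p_views)[:-1], axis=0)
```

With only two entries in `p_views`, the same stacked model corresponds to the canonical correlation analysis setting mentioned in the abstract.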

Similar resources

Bayesian group latent factor analysis with structured sparsity

Latent factor models are the canonical statistical tool for exploratory analyses of low-dimensional linear structure for an observation matrix with p features across n samples. We develop a Bayesian group factor analysis (BGFA) model that extends the factor model to multiple coupled observation matrices. Our model puts a structured Bayesian hierarchical prior on the joint factor loading matrix, ...

Bayesian Sparsity for Intractable Distributions

Bayesian approaches for single-variable and group-structured sparsity outperform L1 regularization, but are challenging to apply to large, potentially intractable models. Here we show how noncentered parameterizations, a common trick for improving the efficiency of exact inference in hierarchical models, can similarly improve the accuracy of variational approximations. We develop this with two ...

Bayesian Structured Sparsity from Gaussian Fields

Substantial research on structured sparsity has contributed to analysis of many different applications. However, there have been few Bayesian procedures among this work. Here, we develop a Bayesian model for structured sparsity that uses a Gaussian process (GP) to share parameters of the sparsity-inducing prior in proportion to feature similarity as defined by an arbitrary positive definite ker...

Group Sparsity in Nonnegative Matrix Factorization

A recent challenge in data analysis for science and engineering is that data are often represented in a structured way. In particular, many data mining tasks have to deal with group-structured prior information, where features or data items are organized into groups. In this paper, we develop group sparsity regularization methods for nonnegative matrix factorization (NMF). NMF is an effective d...

Bayesian Optimal Approximate Message Passing to Recover Structured Sparse Signals

We present a novel compressed sensing recovery algorithm – termed Bayesian Optimal Structured Signal Approximate Message Passing (BOSSAMP) – that jointly exploits the prior distribution and the structured sparsity of a signal that shall be recovered from noisy linear measurements. Structured sparsity is inherent to group sparse and jointly sparse signals. Our algorithm is based on approximate m...

Journal:
  • Journal of Machine Learning Research

Volume 17

Publication date: 2016